9 research outputs found
Fair and Private Data Preprocessing through Microaggregation
Privacy protection for personal data and fairness in automated decisions are fundamental requirements for responsible Machine Learning. Both may be enforced through data preprocessing, and they share a common target: data should remain useful for a task while becoming uninformative about the sensitive information. The intrinsic connection between privacy and fairness implies that modifications made to guarantee one of these goals may affect the other. For example, hiding a sensitive attribute from a classification algorithm might prevent a biased decision rule that uses that attribute as a criterion. This work resides at the intersection of algorithmic fairness and privacy. We show that the two goals are compatible and may be achieved simultaneously, with a small loss in predictive performance. Our results are competitive with both state-of-the-art fairness-correcting algorithms and hybrid privacy-fairness methods. Experiments were performed on three widely used benchmark datasets: Adult Income, COMPAS, and German Credit.
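As a rough illustration of the microaggregation idea named in the title, the sketch below groups a single numeric attribute into clusters of at least k similar values and replaces each value with its group mean. This is an assumption-laden toy: the function name `microaggregate`, the univariate sort-and-chunk grouping, and the parameter `k` are illustrative choices, not the method evaluated in the paper, whose details the abstract does not give.

```python
from statistics import mean


def microaggregate(values, k):
    """Replace each value with the mean of its group of at least k similar values.

    Hypothetical univariate sketch: values are sorted, cut into consecutive
    chunks of size k, and the last chunk absorbs any remainder.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(values) < k:
        raise ValueError("need at least k values")

    # Indices of the records, ordered by their value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    n_groups = len(values) // k

    for g in range(n_groups):
        start = g * k
        # The last group takes the leftover records so no group is smaller than k.
        end = len(values) if g == n_groups - 1 else start + k
        group = order[start:end]
        centroid = mean(values[i] for i in group)
        for i in group:
            result[i] = centroid
    return result
```

After this transformation every published value is shared by at least k records, which is the property that makes individual records harder to single out.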
Preprocessing Matters: Automated Pipeline Selection for Fair Classification
Improving fairness by manipulating the preprocessing stages of classification pipelines is an active area of research, closely related to AutoML. We propose a genetic optimisation algorithm, FairPipes, which optimises for user-defined combinations of fairness and accuracy and for multiple definitions of fairness, providing flexibility in the fairness-accuracy trade-off. FairPipes heuristically searches a large space of pipeline configurations, efficiently reaching near-optimal solutions and presenting the user with an estimate of the solutions’ Pareto front. We also observe that the optimal pipelines differ across datasets, suggesting that no “universal best” pipeline exists and confirming that FairPipes fills a niche in the fairness-aware AutoML space.
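To make the notion of a Pareto front over fairness and accuracy concrete, the sketch below filters a list of candidate pipelines down to those not dominated by any other candidate. It does not reproduce the FairPipes genetic search; the function name `pareto_front`, the tuple layout `(name, accuracy, fairness)`, and the convention that higher is better for both scores are assumptions made for illustration.

```python
def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    Each candidate is a (name, accuracy, fairness) tuple; both scores are
    assumed to be "higher is better". A candidate is dominated when another
    one is at least as good on both scores and strictly better on one.
    """
    front = []
    for name, acc, fair in candidates:
        dominated = any(
            (a >= acc and f >= fair) and (a > acc or f > fair)
            for _, a, f in candidates
        )
        if not dominated:
            front.append((name, acc, fair))
    return front
```

A search procedure would typically call such a filter on the pipelines it has evaluated so far, leaving the final choice among the non-dominated trade-offs to the user.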
Optimising fairness through parametrised data sampling
Improving machine learning models' fairness is an active research topic, with most approaches focusing on specific definitions of fairness. In contrast, we propose ParDS, a parametrised data sampling method by which we can optimise the fairness ratios observed on a test set, in a way that is agnostic to both the specific fairness definition and the chosen classification model. Given a training set with one binary protected attribute and a binary label, our approach corrects the positive rate for both the favoured and unfavoured groups by resampling the training set. We present experimental evidence showing that the amount of resampling can be optimised to achieve target fairness ratios for a specific training set and fairness definition, while preserving most of the model's accuracy. We discuss conditions for the method to be viable, and then extend the method to include multiple protected attributes. In our experiments we use three different sampling strategies, and we report results for three commonly used definitions of fairness and three public benchmark datasets: Adult Income, COMPAS and German Credit.
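The idea of correcting a group's positive rate by resampling can be sketched as follows. This is not ParDS itself: the abstract does not specify its three sampling strategies, so the example assumes one simple strategy (duplicating randomly chosen positive rows of one group), and the names `positive_rate`, `oversample_positives`, and `target_rate` are hypothetical. Rows are assumed to be `(group, label)` pairs with a binary label.

```python
import math
import random


def positive_rate(rows, group):
    """Fraction of rows in `group` whose label is 1."""
    labels = [label for g, label in rows if g == group]
    return sum(labels) / len(labels)


def oversample_positives(rows, group, target_rate, seed=0):
    """Duplicate positive rows of `group` until its positive rate reaches target_rate.

    Assumed strategy for illustration only: random oversampling with replacement.
    """
    if not 0 <= target_rate < 1:
        raise ValueError("target_rate must be in [0, 1)")
    positives = [r for r in rows if r[0] == group and r[1] == 1]
    if not positives:
        raise ValueError("group has no positive rows to duplicate")

    n_group = sum(1 for g, _ in rows if g == group)
    n_pos = len(positives)
    # Smallest x >= 0 with (n_pos + x) / (n_group + x) >= target_rate.
    extra = max(0, math.ceil((target_rate * n_group - n_pos) / (1 - target_rate)))

    rng = random.Random(seed)
    return rows + [rng.choice(positives) for _ in range(extra)]
```

Here `target_rate` plays the role of the tunable amount of resampling: a parametrised method would search over such a value to reach a desired fairness ratio while monitoring accuracy.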
The relationship between trust in AI and trustworthy machine learning technologies
To build AI-based systems that users and the public can justifiably trust, one
needs to understand how machine learning technologies impact trust put in these
services. To guide technology developments, this paper provides a systematic
approach to relate social science concepts of trust with the technologies used
in AI-based services and products. We conceive trust as discussed in the ABI
(Ability, Benevolence, Integrity) framework and use a recently proposed mapping
of ABI on qualities of technologies. We consider four categories of machine
learning technologies, namely those for Fairness, Explainability, Auditability
and Safety (FEAS) and discuss if and how these possess the required qualities.
Trust can be impacted throughout the life cycle of AI-based systems, and we
introduce the concept of Chain of Trust to discuss technological needs for
trust in different stages of the life cycle. FEAS has obvious relations with
known frameworks and therefore we relate FEAS to a variety of international
Principled AI policy and technology frameworks that have emerged in recent
years.
Comment: This submission has been accepted at the ACM FAT* 2020 Conference.